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0^ The Quermass model is a generalization of the classical germ-grain Boolean model for 

which a morphological interaction is added. It allows to model random structures with 
^ specific morphologies which are unlikely to come from a Boolean model. The Quermass 

^vq model depends on three interacting parameters and on an intensity parameter. Since the 

^— I number of points is not observable from a germ-grain set, the estimation of all parameters 

J> is not possible from classical likelihood or pseudo-likelihood approaches. In this paper, we 

kj> present a procedure based on the Takacs-Fiksel method which is able to fit all parameters 

of the Quermass model, included the intensity. An intensive simulation study is conducted 
^ to assess the efficiency of the procedure and to provide practical recommendations. An 

application to heather data, initially studied by P. Diggle, is finally proposed. 



KEY-WORDS: Gibbs Point Process, germ-grain model, Quermass-Interaction Process, 
Area-Process, Perimeter-Process, Takacs-Fiksel estimator. 
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1 Introduction 



Physics, biology or agronomy are often confronted with problems involving complex ran- 
dom sets like liquid-vapor interface structures, micro-emulsions, porous media or propaga- 
tion domains of plants. Admissible models for these random structures are the germ-grain 
models, which consist in the union of locally finite random sets called grains. The most 
popular germ-grain model is certainly the Poisson Boolean model of balls. In this model, 
the grains correspond to balls, whom centers are distributed as a Poisson point process 
and the radii are independently and identically distributed. The probabilistic and statisti- 
cal properties of the Boolean model have been widely studied (see for example [16], [27]). 
However, the variety of morphological structures induced by the Boolean model is limited. 
To reach more realistic morphologies, some Gibbs modifications of the Boolean model have 
been developed. The rough idea is to introduce an interaction (or Hamiltonian) acting on 
the morphology of the random set. This leads to a new random structure which tends to 
minimize this Hamiltonian. In this way, some morphological features are more likely to 
occur than for Boolean sets. 

A first morphological interaction based on the area of the germ-grain structure has been 
introduced in [30] to model phase transition phenomena in statistical mechanics. More 
complex interactions appeared later in the nineties [H [T31 EH [l5j . Hadwiger's theorem jlO] 
ensures, under mild conditions, that any function acting on an union of compact convex sets 
can be decomposed into a linear combination of the Minkowski (or Quermass) functionals. 
These functionals correspond in to the area, the perimeter and the Euler-Poincare 
characteristic (number of connected components minus number of holes). Accordingly, 
any morphological interaction in can be written as a linear combination of these three 
functionals. The Quermass model corresponds to the Gibbs modification of the Boolean 
model based on such linear interactions. From Hadwiger's theorem, it appears as a very 
rich model to represent random structures. 

The Quermass model in depends on the law of the radii and on four parameters: 
the three coefficients in the linear combination defining the Hamiltonian, and the intensity 
parameter of the underlying Boolean model. The present paper deals with the estimation 
of these four parameters. 

The main difficulty comes from the nature of the observable data, which are the germ- 
grain set. In particular the positions and the number of balls are not observed and can not 
be used in the statistical procedure, which is untypical in estimation problems for Gibbs 
point processes. Note that this specific issue already occurs for the estimation of the 
intensity parameter of the classical Boolean model. In this case, some explicit estimating 
equations have been found, that express the intensity parameter in terms of the specific 
volume of the set, see |16| . In presence of Gibbs interactions as for the Quermass model, 
it is well-known that the computation of macroscopic quantities is intractable, so a similar 
explicit estimation procedure is not possible. 

Assuming the intensity parameter of the underlying Boolean model is known, a maxi- 
mum likelihood approach is investigated in [18j for the estimation of the three parameters 
of the Quermass interaction. Unfortunately the intensity parameter can not be fitted by 
this method due to the unobservability of the number of points. Section |3.1| gives more 
details about this procedure and explain the serious consequences of a misspecification of 
the intensity parameter in practice. In this paper, we fit all parameters of the Quermass 
process, including the intensity, via a Takacs-Fiksel procedure (|28). jTj, |3]). The Takacs- 
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Fiksel contrast function is based on an empirical counterpart of the Georgii-Nguyen-Zessin 
equilibrium equation (2.3), and depends on the choice of some test functions. Due to the 
unobservability of points, the contrast function is not computable in general, but some 
specific choices of test functions overcome this issue (see Section 3.2.2). This is the key 
ingredient for the estimation of the Quermass model by the Takacs-Fiksel procedure. The 
purpose of the present paper is to develop in details this method, to provide a simulation 
study and to deduce some practical recommendations. 

As an application, we finally fit a Quermass model to heather data. Heather dataset 
has been initially studied by P. Diggle in |6], followed by many studies |12) . j^, |19] . 

|18)). We show that our fitted model seems a better approximation of heather dataset, 
both from a visual impression and from a stastical diagnostic inspection. 

In Section [2] we introduce the Quermass model and we recall the fundamental Georgii- 
Nguyen-Zessin equation. Section |3] presents the estimation procedures. The limitation in 



using the maximum likelihood approach is explained in Section |3.1| Then in Section 3.2 



we present the general Takacs Fiksel procedure and its application to the Quermass model. 
Practical aspects for the implementation of the procedure are given in Section [4j A simula- 
tion study assessing the efficiency of the procedure is presented in Section|5] An application 
to heather dataset is conducted in Section [6] Note finally that an appealing alternative 
estimation procedure would consist in combining the likelihood and the Takacs-Fiksel ap- 
proaches, to take advantages of both methods. Following this idea, we have developed a 
mixed procedure. But it turns out to be very time consuming without being more efficient. 
Its presentation is postponed to an appendix. 



2 Quermass Model 
2.1 Notations 

We denote by E the space x [0, i?o] (where i?o > is a fixed positive real number) 
endowed with its natural Euclidean Borel cr-algebra. It is the space of marked points 
(x, i?) where x E M'^ is the centre of a ball and K its radius. We assume for simplicity 
that the radii are uniformly bounded by Kq. This assumption allows to define easily 



the Quermass model on the full plane M , see section 2.3 though this restriction is not 
mandatory (see [5]). For any bounded set A in M^, we denote by iSa := A x [0, i?o] the 
restriction of E to A. 

By definition, a configuration of points a; is a locally finite subset of iS, which means that 
the set WA := w H f a is finite for any bounded set A in M^. The space of all configurations 
of points in E is denoted by 17, while for any bounded set A in M^, J^a denotes the subspace 
of configurations included in Et^. 

For X G M^, we write for short x G a; if there exists R € [0, iio] such that (x, K) G uj. For 
(x, i?) G I? we write a;U (x, i?) in place of w U {(x, i?)} and a;\(x, i?) in place of a;\{(x, i?)}. 

For any configuration a; we denote by lAij^ its germ-grain representation defined by the 
following set 

U^:= U B(x,i?), 
where B{x,R) is the closed ball centered at x with radius R. 
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Let fi he a reference probability measure on [0,i2o]- We denote by A the Lebesgue 
measure on and by tt^ the marked Poisson process on £ with intensity measure X0 ji. 
For every bounded set A, the probability measure vr^ denotes the marked Poisson process 
on £\ with intensity measure Aa (8) fJ-- Recall that the law of the random set under 
the probability measure tt^ is nothing else than the standard homogeneous Boolean model 
with intensity one and distribution of radii fi. 



2.2 Quermass model on a bounded window A 

Following Kendall et al. [l3J, for any configuration uj\ in a bounded window A in M?, the 
Quermass-interaction (or Quermass Hamiltonian) is defined by 

H'M = 01 A{U^^) + 02 CiU^J + 03 x(^c.J, (2.1) 

where := (01,02,03) is a vector of real parameters. The functionals A, C and x are 
the three fundamental Minkowski (or Quermass) functionals: area, perimeter and Euler- 
Poincare characteristic (number of connected components minus the number of holes). 

From Hadwiger's Theorem |10| . any additive functional F defined on the space of finite 
unions of convex compact sets (i.e. ^(^4 U B) = F{A) + F[B) — F{A n B)) and satisfying 



some continuity assumption (see [TU]) can be decomposed as in (2.1). This universal 
representation explains the interest of the Quermass interaction for morphological model 
at mesoscopic scale |14| [T5] . 

Definition 2.1. The Quermass point process on a hounded set A in for the parameter 

Z 

0, the intensity z > and the distribution of radius /i is the probability measure P^' on Q\ 
which is absolutely continuous with respect to the marked Poisson Process vr^ with density 

where Z/^{z,0) := J z'"'^'^^^e~^^^^^''7rj!^{duJA) is a normalizing constant called the partition 
function. 

Some simulations are displayed in Figure [T] They correspond to Quermass models 
involving only one non-null interacting parameter: the Area process (02 = 03 = 0), the 
Perimeter process (^i = 0^ = 0), and the Euler-Poincare process (^i = ^2=0). These 
extreme situations show the rich variety of random sets that Quermass processes can 
provide. Note that in the three situations displayed in Figure [T| the interacting parameter 
is positive, so that the resulting random set is more likely to induce a lower Minkowski 
functional (resp. A, C or x) than in the Boolean case. These simulations have been done 
by a birth-death Metropolis-Hasting algorithm as presented in |17| . 



2.3 The Gibbs property: Quermass model on the full plane 

In this section the Gibbs (or Markov) property of the Quermass model is displayed via the 
Georgii-Nguyen-Zessin (GNZ) equation. Another equivalent and certainly more natural 
presentation could have been considered via the DLR equations. But the latter point view 
requires to introduce more technical materials, that we do not use in this work, so we 
decided to focus on the GNZ perspective. We refer to [SI ESj [25] for details about the 



4 






Figure 1: Samples of: the Area process with z = 0.1, Q\ — 0.2 (left); the Perimeter process with 
z = 0.2, 02 = 0.4 (middle); the Euler-Poincare process with z = 0.1, 63 = 1 (right). The window 
is [0,50]^ and the radii are uniformly distributed on [.5,2]'^. 



general theory of Gibbs point processes with, in particular, a presentation via the DLR 
equations. 

First, let us define the local energy of a marked point (x, R) with respect to a configu- 
ration u! by the following expression 

h\(x,R),uj) := H^uaU {x,R)) - H'^iujA), (2.2) 

where A is any bounded set containing the ball B(x, 2Rq). Thanks to the additivity of the 
functionals A, C and Xi this local energy does not depend on the choice of such A, so we 
can simply choose A = B(x, 2Rq). Note that the local energy is related to the Papangelou 
intensity 

m ^ gA(c^A U {x,R);z,e) 
A {[x,R),u}) ■= , 

£/a(wa;2;,6') 

where A satisfies the same condition as above, by \*{{x,R),U}) = exp(— /i^((x, i?), w)). 

We have the following characterisation of the Quermass process via the Georgii-Nguyen- 
Zessin equation (GNZ equation). 

Proposition 2.2 (Georgii [8j, Nguyen-Zessin [21j). For any hounded set A, a probability 
measure P on JIa is the Quermass point process on A for the parameter 9, the intensity 
z > and the distribution of radius p, (i.e. P = P^'^ ) if and only if for any non-negative 
function f from J^a x <£" to M 



^ ( ((^' '^a\(x, R))j =e(^J^' J^z e-'»''((-.«).-A)/ ((x, R),ojj,) dx /.(di?)) , 

(2.3) 



where E denotes the expectation with respect to P 



The GNZ-equation involves the expectation under the Quermass process of two com- 
pletely different expressions. This equilibrium equation is the starting point of the Takacs- 



Fiksel estimation procedure presented in Section 3.2 
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In the present paper, we mainly consider Quermass processes on bounded windows as 



presented in definition 2.1 However, it is necessary to consider Quermass processes on the 
full plane for questions involving asymptotic properties of estimators (as consistency 



an infinite configuration uj in 
no sense in this case. A definition of the Quermass process on 



or asymptotic normality). This extension is not trivial since the energy H (uj) in (2.1) of 

l'^ does not exist in general and so the definition 2.1 makes 

i'^ is possible thanks to the 



GNZ-equation (2.3). 



Definition 2.3. A probability measure P^'^ on Q is a Quermass point process on for 
the parameter 9, the intensity z > and the distribution of radius fi if for any non-negative 
function f from x £ to M., for any bounded set A inM."^, 



E 



f{ix,R),co\ix,R)) 



E 



Ro 



ze 



-h'H^,R),^)f(^^x,R),uj)dx fiidR) 



where E denotes the expectation with respect to 



The existence of a measure P^' satisfying definition 2.3 



its uniqueness or non-uniqueness 
(i.e. the occurence of phase transition) are difficult problems of statistical physics. The 
existence has been proved recently for any z > and any € in [5]. The uniqueness of 
P^'^ depends on the values of the parameters and very few is known about this issue. For 
the area process (i.e. 62 = 0^ = 0) and when the radii are not random (i.e. //({Pq}) = 1); 
a phase transition has been proved to occur in the very particular case where 9i = z and 
z is large enough (|2l [30]). 



3 The estimation procedures 

Let us consider a realisation a; of a Quermass point process p^*'^* defined on A', where 
A' C M^, and where z* and 9* are unknown. In practice, one only observes the random set 
U;^ n A in a bounded window A C A', and not uj\. So the positions of the marked points 
(x, R) in UJA are unknown. A challenging task is then to estimate the parameters z* > 
and 9* = (0J, ^3) ™ presence of this unobservability issue, which is an unusual problem 
of inference for Gibbs point processes. The first subsection explains the limitation in using 
the maximum likelihood procedure (MLE). The second subsection focuses on the main 
approach of this paper which is the Takacs-Fiksel (TF) method. An attempt to combine 
MLE and TF procedures is displayed in appendix. We assume all along this paper that 
the distribution of the radii /x is known. 

3.1 The maximum likelihood approach 

When z* is known, the classical maximum likelihood approach let the estimation of the 
parameter 9* possible. This has been investigated in |il8j . where the authors introduce an 
original procedure based only on the connected components that are completely included 
in the observation window. The boundary effects are then limited. However, the maximum 
likelihood procedure does not allow us to fit the intensity parameter z* since the number 
of points is unobservable. If z* is unknown, a two step procedure is proposed in [18]: first 
estimating z* as if the observable data come from a Boolean model (various methods are 
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available in this setting), then, in a second step, applying the MLE procedure to estimate 

e*. 

Unfortunately, it seems that this two step procedure induces a strong restriction on the 
possible values for the interacting parameter 9*. Indeed, in the first step, the estimation z is 
chosen such that the Boolean model with intensity z has the same specific volume than the 
observed real data. Similarly, in the second step, the MLE estimation of 9* ensures that the 
corresponding Quermass process has the same specific volume than the real data. So, this 
two step procedure concerns Quermass processes with intensity parameter z, that exhibit 
the same specific volume as the Boolean model with intensity z. This turns out to be a 
strong restriction. For example, it is well-known that the Area process (i.e. ^2 ~ ^3 ~ 0) 
with parameters z* and 0* > is strictly stochastically dominated by the Boolean model 
with intensity z* [9j, which implies a strict comparison of their specific volumes. For this 
example, the lonely possible value of 91 following the previous restriction is thus 91 = 0. 



3.2 The Takacs-Fiksel approach 

The possibility to use the Takacs-Fiksel procedure for fitting all parameters of the Quermass 
process, including z* , has been recently emphasized in [3]. In this section we recall the 
procedure, which depends on the choice of some test functions. Next we display various 
test functions, which allow to overcome the unobservability issue of the Quermass model. 



3.2.1 The general procedure 

Consider, for any non-negative function / from Q x if to R, the random variable 
Cfiu;;f)= V f{{x,R),u;\ix,R))- ^ [ z e-'^'«^-'^)''^)/((x, i?), .;)dx /i(di?). (3.1) 

From the ergodic theorem, if A is large enough, C^^{lo; /)/|A| is approximatively equal 
to E^t.e* (Cjg ^2 ('^j/))) where Ej;*,e* is the expectation with respect to p^*fi* , Moreover, 



from the GNZ equation (2.3) it is easy to show that "K^.* fi*{C^Q ('^i /)) = 0. Therefore 



for any function /, the random variable in (3.1) should be close to zero when z = z* and 



= 9* . This remark is the basis of the Takacs-Fiksel approach. 

Let us give K functions {fk)i<k<K, the Takacs-Fiksel estimator is simply defined by 



K 



{z,9):=aTgminJ2(cf{oo;fk)] • (3.2) 



k=l 



The strong consistency and asymptotic normality of (z, 9) when the observation window 
A grows to M? are discussed in [3]. The main ingredient for consistency is to ensure that 



{z* , 9*) is the unique solution to the optimisation problem in (3.2) when A is large enough. 
This identifiability is not easy to check. Some criteria are given in j3]. If p is the number 
of parameters to estimate, it turns out that choosing K = p test functions may be not 
sufficient to guaranty identifiability. K > then identifiability is in general achieved. 

In the setting of Quermass model, it is in general not possible to compute C^^{u};f) 
given the observation U^j H A. Consider for instance the pseudo-likelihood estimator. It is 
a particular case of the Takacs Fiksel procedure where fk = dh^{{x, R),oj) /d9k, k = 1,2,3 
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and /4 = Xjz (see section 3.2 in [3]). In this case the computation of (3.1) requires 



the observation of all marked points in wa and so the pseudo-hkehhood procedure is not 
feasible. However, it is possible to find some test functions / such that 

1. for all X S A and R G [0, i?o]i fiix,R),uj) is computable, making the integral term 



in (3.1 ) calculable. 



2. the sum term in (3.1 ) is computable from the observation of U^j H A only. 



Some examples are provided below. 

Note finally that for any x £ A and any R G [0, Rq] the computation of the local energy 
h^{{x,R),uj) is always possible (up to some possibly edge effects discussed in Section 4.3 1 . 
Indeed, denoting l/({x, R) = ti^^ H B{x, R), the additivity of A, C and x implies that 



h\{x,R),u) 



ttR^ - AiU{x,R)) 



2itR- C{Vl{x,R)) 



x{U{x,R)) . 

(3.3) 



3.2.2 Some well-adapted test functions 



In this section, we introduce some test functions / that make (w; /) in (3.1 ) computable 
from the observation oiUi^OA only. This provides a solution for estimation in the Quermass 
model in spite of its unobservability issue. We restrict our presentation here to the case 
u = WA, or equivalently A' = A, meaning that the observation window coincides with 
the domain of definition of p^*'^*. The general case involves some edge effects which are 
discussed in Section [4.31 

Let us first consider the test function /q defined by 

/o((x,i?),^) = Length(5(x,i?)n(^/,,)'=), (3.4) 

where S{x,R) is the sphere with centre x and radius R. The quantity /o((x, i?), cj) is 
actually the length of arcs from the sphere S{x,R) which are outside Ui^. Although the 
quantity fo{{x, R),uj\{x, R)) is not observable for any {x,R) in lo\, the sum over (x,R) G 
u}\ is nothing else than the total perimeter of U^jj^ : 



foi{x,R),u;\ix,R)) = CiU^ 



(3.5) 



Moreover, it is possible to compute fQ{{x,R),u}) for any x G A and R G [0,i?o]j making 



the integral in (3.1) calculable. It follows that C^^ {u; Jq) is computable. 



Similarly for any a > 0, we consider the test function 

fa{{x, R),uj) = Length (5(x, R + a)n{U^® B{0, q))^) , (3.6) 
where Uui © -6(0, a) denotes the a-parallel set of U^^ defined by the Minkowski sum of Uuj 



and B{0,a). In this case (3.1) is also computable and the sum therein reduces to the 
perimeter of the a-parallel set: 



^ Uix,R),oj\{x,r)) = CiU^^eB{0,a)). 



(3.7) 
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In the same spirit as (3.5 ), it is natural to look for some test functions /area and f^p such 
that the sum in (3.1 ) respectively reduces to ^(Z^/oja) ^^'^ xI^i^a)^ ^^^'^ other hand the 



integral term in (3.1) is calculable. Unfortunately, we have not found any. Nevertheless, 

/sum = ^ ^ fai ) 



considering the test function 



(3.8) 



where the sum is done over some suitable finite set of non negative a^'s, we obtain 

^ fsumi{x,R),oj\ix,r)) = Y,C{U^^eBiO,ai)), (3.9) 



which is closely related to the area of the complementary of Ui^^ in A, see Figure [2] For 
this reason, /sum can be viewed as a substitute for /area- 






Figure 2: Left: iY^^ in gray, as the union of the balls bordered with dashed lines; the length of 



the solid black line is the quantity (3.5). Middle: same 14^^^ in gray; the length of the solid black 



line is the quantity (3.7) for some a > 0. Right: same U^j,^ in gray; the sum of lengths of the solid 



black lines is the quantity (3.9) for 7 a^'s 



Let us finally introduce the test function /jgo which indicates, for any (x,i?) G wa, 
whether B{x,R) is an isolated ball in lAi^: 



fiso{{x,R),Uj) 



1 if S{x, R) n 
otherwise. 



(3.10) 



In this case, YIxi^lua fiso{{x, R),uj\{x, R)) is the number of isolated balls in ^i^a- Let us 
note that an isolated ball can contain smaller balls completely included inside it. 

Remark 3.1. The test functions Jq, fa, fsum, fiso introduced above satisfy all the regularity 
conditions assumed in /Si, which implies consistency and asymptotic normality of associated 



TF estimators, provided identifiability holds (see Section 4-4^ 



4 Practical aspects of TF estimation 
4.1 Computation of the contrast function 



The TF estimator in (3.2) requires to compute ("^i /) in (3.1) for the different well- 



adapted test functions / introduced in Section 3.2.2[ Note that Section 4.3 explains how 
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handling edge effects in tlie computation of (3.1 ), so we don't adress this question in this 
section. 



The integral term in (3.1 ) can be approximated by a Monte Carlo approach, based on N 
independent points xi, . . . ,xn uniformly distributed on A and independent realisations 
Ri, . . . , Rn from /x. This leads to the approximation 

1 ^ 

Cf (a;;/) ^ f{{x,R),u\{x,R)) - - ^ z e-'^'«--«')'-)/((x„ i?,), a;). (4.1) 



Let us focus on the computation of (4.1) for the test functions /o, /«, /sum and some 
fixed values of z and 9. The first sum above reduces respectively to the perimeter of U^jj^, 
the perimeter of the a-parallel set of Uu)j^, or the sum of such perimeters. The construction 
of a-parallel sets of ^^ja ^he calculation of the above-mentionned perimeters may 
be achieved using some mathematical morphology tools, see e.g. |22], |20) . |26| . The 



second sum in ( |4.1[ ) involves the local energy of each new ball (xj, Ri), given by (3.3), and 
f{{xi,Ri),uj), for / = /o,/a,/sum- While for each i we can use again some mathematical 
morphology tools to calculate these two terms, the computation for the whole sum, which 
requires to repeat times these calculations, may become difficult to implement. 



An alternative procedure to compute (4.1 ) is to approximate the observed set UojHA by 



any union of balls (e.g. by the procedure described in |29j) and to consider the associated 
power tessellation (see |17|). Then the construction of approximated a-parallel sets ofUi^^ 
becomes straightforward since it suffices to increase the radius of each ball by a. Moreover, 
the calculation of the above-mentionned perimeters, the calculation of the local energy of 
any new ball (xj,i?i), and the calculation of f{{xi,Ri),Ld), for / = fo, fa, fsnm, are easily 
deduced from the associated power tesselations as described in |17) . 

We emphasize that the above procedure does not depend on the union of balls used to 
approximate Z//^nA. In order to minimize the computational complexity, an approximation 
involving a low number of balls is therefore preferable. Moreover, this union of balls is only 
used for computational reasons and has nothing to play with the germ-grain representation 
of that we aim at fitting. In particular, the number of balls in this approximation is 
not relevant to estimate z* . 



Finally, let us focus on the computation of (4.1) for the test function /iso. Contray to 
/O) fa and /sum) the test function /jso is strongly related to the ball structure of the germ- 
grain model, since the ffist sum in (4.1 ) corresponds to the number of isolated balls in U^j^- 



In this sense /iso seems less natural for applications. However, as it will be demonstrated in 
Section [5| the test function /iso appears to provide a relevant information for the estimation 



issue. For this reason, it can be important to include it in (3.2). In practice, we have then 
to decide what is considered as an isolated ball. A solution can be to consider an isolated 
component of Ui^^ as an isolated ball, if its diameter is smaller than some constant chosen 
a priori. Another solution is to use the approximation of U^^ n A by a union of balls, as 
mentioned above: an isolated component can then be considered as an isolated ball if it is 
approximated by an isolated ball. 

In this work, the practical implementation of the TF procedure has been conducted 
using an approximation of H A by a union of balls and the construction of the associated 
power tesselation. 
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4.2 Minimisation of the contrast function 



As described in section |3.2.2[ many computable test functions are available to implement 
the TF procedure for the Quermass model: /o, fa for any a > 0, /sunn /iso- 

In order to estimate p unknown parameters among z* , 0^, 02 and ^g, the law of the 



radii fj, being fixed, it suffices to choose K > p test functions as above and to solve (3.2 1 
where 9 = (01,02,^3)- 



The TF optimization (3.2) in z can be done explicitely. We deduce that the solution 



{z,9) of (3.2) necessarily belongs to the implicit manifold z = z{6) where 



(4.2) 



Sk=Yl fki{x,R),oj\ix,R)) 



and 



hiO) 



Bo 



fki{x,R),u))dx fj.{dR). 



Therefore, in practice, 6 is first estimated by the solution 9 of (3.2) where z is replaced by 
z{9), i.e. 



K 



9 = argmmJ2{Sk-mh{0)y 



k=l 



This solution may be obtained by a grid search optimization procedure. Then z is estimated 
by z = ~z{e). 



Note that the practical computation of all terms involved in (4.2) and (3.2) can be 



conducted as explained in Section 4.1 



This procedure allows to consider several estimators, depending on the number and 



the choice of test functions used in (3.2). It is not easy to find an optimal choice. The 



asymptotic variance of TF estimators is known (cf [3]), but appears intractable to be 
optimized. Some simulations are thus mandatory to provide some recommendations, see 
Section [5] 

4.3 Edge effects 

Based on the observation o^h^^f^K^ two types of edge effects may occur in the computation 



of (3.1) or (4.1). 



First, edge effects may occur in the integral term in (3.1), or the second sum in (4.1), 
for the computation of h*^ {{x , R) , io) and f{{x,R),uj), for / = fo, fa, fsum, or /iso, when 



X is close to the boundary of A. In view of (3.3), (3.4), (3.6), (3.8), ( |3.10 ), this type of 
edge effects does not occur for the marked points (x, R) such that B(x, i?max) C A, where 
Rmax = Rq + maxj Oj. Therefore, in practice, this first type of edge effects can be avoided 
by considering the estimation on A , where A denotes the eroded set of A by i?niax- 

The 



estimators are then defined by (3.2) where A is replaced by A 
/ 



Second, edge effects can occur for the first sum term in (3.1 ) and (4.1 
= /o- If w = UA, as it was assumed for simplicity in Section 3.2.2 



Let us consider 



then from (3.5 ) 
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this sum term reduces to Cipl^f^). As we only observe n A, an approximation in this 
case could be C{U^ n A) and we have CiUui H A) < C(Uujj^)-, where the equahty occurs 
if and only if lAui C A. In the general case when oj 7^ waj the equality (3.5) does not 



hold any more and some extra terms due to edge effects have to be added in the right 



hand side of (3.5 1 . Furthermore, to avoid the first type of edge effects, the estimation 
procedure is implemented on the eroded domain A~. In this general case we may also use 
the approximation 

fo{{x,R),uj\{x,R))^/:{U^nA-), (4.3) 



where the last perimeter can be computing by some mathematical morphology tools. Al- 
ternatively, n A can be approximated by a union of balls, as suggested in Section 4.1 
namely for some integer n, U^^ n A ~ UiLi ^iVij 
use the approximation 



where G A and > 0, and we can 



hUx,R),w\{x.R)) 



E 



Length S{y., 



n 



n N 



(4.4) 



This last sum can easily be implemented from the power tesselation based on IJILi -^(y*' '''«)• 
Note that error of approximations in (4.3) and (4.4) involve only edge effects that 
become negligible when A tends to M?. Finally similar approximations can be used for 
f = fa, /sum or /iso instead of / = /o and we omit the details. 



In this work, we have decided to handle edge effects by working in (3.2 ) with the eroded 
domain A~ instead of A, and by using the approximation (4.4), which is in agreement with 



our choice made to compute easily in practice the contrast terms (4.1) 



4.4 Identifiability 

The consistency of the TF procedure crucially depends on an identifiability assumption 



which basically implies that the contrast function in (3.2) has a unique minimum when 
A tends to M? (see [3j). This assumption is in general satisfied if K > p, where p is the 
number of parameters to estimate. When K = p, identifiability is not easy to check, but 
has been proved to hold in some cases (see Example 2 in [3]). 

Nevertheless, these theoretical considerations only ensure asymptotic identifiability. In 
practice, where A 7^ M^, several local minima of the contrast function in (3.2) may arise 



and the global minimum might lead to an improper estimation. To avoid such drawback, 
a solution is to consider several TF contrast functions, coming from various choice of test 
functions. They should all share a local minimum in the same region, which allows us to 
restrict the domain of optimization. 



5 Simulations 

5.1 Estimation of the intensity parameter z* 

Let us first assume that Of, O^, ^3 are known and let us focus on the estimation of z* , which 



appears as the main challenge for the Quermass model. In this setting, the equation (4.2) 
with 6 = 6* provides an explicit estimator of z*. The computation of (4.2) is conducted 



12 



as explained in Section [4] we focus on an eroded domain to prevent edge effects and the 
Monte Carlo approximation (4.1 ) is applied, where we choose N = 2500. 

More specifically, we consider the estimators zq, Zai, • • • , Zaio, ^iso and isum, defined 
by ( |42| where K = 1 and where is respectively /o, fa^, • • • , /aio, /iso and /sum- The 
latest test function /sum is defined by the sum (3.8) over the ten previous Oj, i = 1, . . . , 10. 

Some results for the estimation of z* are displayed in Figure |3] based on 100 replicates 
of the Area process {9^ = 0, 9*^ = 0) on [0,50]^, when z* = 0.1, 91 = 0.2 and ^ is 
the uniform law on [.5,2]. The left plot is concerned with large parallel sets: = i/5, 
i = 1, ... ,10, while the right plot involves small parallel sets: ai = i/50, i = 1, . . . , 10. 

Figure [3] confirms that our procedure allows to estimate z in presence of the unobserv- 
ability issue. Moreover, it appears that small parallel sets seem to provide better estimates. 
The same conclusion holds from intensive estimations of z* for other values of 9^, ^2 and 
^3 (omitted in this article). 




Figure 3: Estimations of z* = 0.1 from 100 replicates of an Area process {9^ = 0.2). Left: 
boxplots of zq, Za^ {ai = i/5), Ziso and Zsum- Right: same boxplots but with Ui = i/50. 



5.2 Estimation for the A, C and x process 

In this section, we consider estimation in the Area process = ^3 = 0), in the Perimeter 
process (0* = ^3 = 0) and in the Euler-Poincare process (0* = 92 = 0). In these cases, 
two parameters have to be estimated: the interacting parameter {91, ^2 or 9^) and the 
intensity parameter z* . 

Many TF estimators are conceivable. For instance, for the Area process, the estimation 
of (z* , 91) can be achieved by the following TF estimators with K = 2: (ziso, ^iso) based on 
(/0)/iso)i (^ai^a) based on {fo, fa) for some a > 0; (^sumj^sum 

) based on (/o,/sum) where 



the sum (3.8) is done over ten Qj's. It is also possible to consider more test functions, as 
for example with K = 11: (ialh ^all) based on /q, fai, ■ ■ ■ , faw for ten different a^'s. 

Some simulations (omitted here) show that the estimations {za,9a) may be very vari- 
able depending on a for a given observation lA^^^, even when using small parallel sets as 
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suggested in section 5.1 As far as we do not know which estimator is the most efficient, 
we prefer to use estimators that combine several small parallel sets. For this reason, we 
will concentrate on (zsum,^sum) and (zaii, 6'aii). 

Another natural way to combine information coming from various parallel sets is to 
consider an aggregated estimator. How to aggregate several estimators is an interesting 
but difficult question, particularly in presence of strong dependent observations as in the 
Quermass model. We simply consider here the median of several estimators, that provides 
a natural robust combination: 

) = median . . . , (zq^q, ^^lo))- 

Therefore, the following simulations focus on the three estimators: (-Ssunn ^sum)i (^medi ^med) 
and (^alb^all)) where ai = i/50, i = 1, ... ,10. Moreover, since the test function /iso plays 
a particular role as it is stronlgy related to the ball structure of the germ-grain model, we 
also assess the performance of (iisoj^iso); based on (/o,/iso)- 

All simulations are done on [0,50]^. The law of radii /u is uniform on [.5,2]. The 
practical implementation is conducted as explained in Section [4j that includes erosion and 
the Monte Carlo approximation (4.1). Figure [l] displays the estimation results on 100 



replicates of, respectively: the Area process with z* = 0.1, 6* = 0.2; the Perimeter process 
with z* = 0.2, 02 = 0.4; and the Euler-Poincare process with z* = 0.1, ^3 = 1. 

As a ffi'st conclusion, we see from these simulations that all estimators considered in 
these simulations allow to estimate the two parameters and perform more or less equiv- 
alently. However, (ialb^all) seems to slightly surpass the other estimators, in terms of 
dispersion and outliers. 

5.3 Estimation for the full Quermass process 

To deal with the estimation of more interacting parameters, we proceed according to the 
previous conclusion. Given ten small a^'s, we consider the TF estimator based on all 
available test functions /o, fai, /aio) /sum and /iso- Since /iso may be artificial to 



compute in practice (see Section 4.1), we also consider the TF estimator based on all 
previous test functions except /iso. 

Some results of estimation are presented in Figure [6] for 100 replicates of the {A,jC)- 
process ((9^ = 0) on [0,50]^, with z* = 0.1, 9^ = -0.2, 6^ = 0.3 and the law of radii is 
uniform on [.5, 2]. The same kind of results are displayed in Figure [7] for the full Quermass 
process with z* = 0.1, = —0.2, 62 = 0.3 and ^3 = —1. See some examples of samples in 
Figure [5] 

While the quality of estimation turns to be satisfactory in presence of one interacting 



parameter (i.e. for the ^-process, the £-process or the iS-process) as seen in Section 5.2 
it naturally decreases in presence of more interacting parameters (Figure |6]), to become 
very variable for the full Quermass model (Figure [7]). Moreover, Figure |6] shows that the 
presence of the test function /iso seems to improve the estimation, even if its computation 
in practice might appear artificial. 
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0.04 0.06 0.08 0.10 0.12 0.14 0.16 Z,^ Zgn Z^ed Zsum 9iso Qall 9med Ki 



Figure 4: Estimations from 100 replicates of, from top to bottom: the ^-process (z* = 
0.1, ei = 0.2); the ^-process {z* = 0.2, 9^ = 0.4); the x-process {z* = 0.1, 6^3 = 1). 
The estimators are (ziso,^iso) (green), (ialh^all) (red), (imed,^med) (black) and (isum,<9sum) 
(blue). From left to right: scatterplot of {z,9); boxplots of z; boxplots of 6. 
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Figure 5: Samples of the £)-process with z* = 0.1, 61 = -0.2 and ^| = 0.3 (left) and of the 
Qucrmass process with z* = 0.1, 91 = -0.2, 6*2 = 0.3 and ^3 = -1 (right). The window is [0,50]^ 
and the radii are uniformly distributed on [.5,2]^. 
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With iso Without iso With iso Without iso With iso Without iso 



Figure 6: Estimation of z* (left), 61 (middle) and $2 (right) from 100 replicates of the 
(.4., >C)-process with z* = 0.1, 9^ = —0.2 and 02 = 0.3. In each plot, the estimator involves 
/iso (on the left) or not (on the right). 




With iso Without Iso WItti Iso WIttiout Iso WItti Iso WIttiout Iso With Iso Without Iso 

Figure 7: Estimation (from left to right) of z* , 61, 02 and 6^ from 100 replicates of the 
Quermass process with z* = 0.1, 9^ = —0.2, 62 = 0.3 and ^3 = —1. In each plot, the 
estimator involves /iso (on the left) or not (on the right). 
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6 Application to heather data 



The plot on the left hand of Figure [8] shows a binary image of the presence of heather 
in a 10 X 20 m rectangular region at Jadraas, Sweden. This heather dataset has been 
widely studied. It was first presented by P. Diggle in |6J, where it has been modelled by 
a stationary spherical Boolean model. This point of view has been considered further in 
im [T2] and in ||4j , where alternative estimation methods for the Boolean model have been 
implemented. 




Figure 8: Heather data (left) and its approximation by a union of balls (right) 

Considering heather bushes as discs, the spherical germ-grain representation of the data 
seems to be a natural approximation. However, the independance between the location 
of the grains and their radii, implied by the Boolean model, appears more questionable. 
As a matter of fact, simulations of the fitted Boolean models from the aforementioned 
studies do not look visually similar to the heather dataset, as initially observed by P. 
Diggle in j6]. This lack of fit has been confirmed in fTSj by various diagnostic plots (as 
in Figure |9]). The same conclusion is made in jl^, where a Matern's cluster model is 
proposed as an alternative to the Boolean model. However no estimation procedure for 
this Matern's cluster model is given in jl9| . Instead, a brute-force approach is used to 
choose the parameters in order to pass some diagnostic plots. While the latter model 
appears well adapted to the heather dataset, a comparison between our statistical method 
and the approach in [19] is therefore difficult and will not be conducted further. 

Alternatively, a model of interacting discs has been fitted to the heather dataset in 
|18j. The model is very similar to the Quermass model, except that in the Hamiltonian 
(2.1), the Euler characteristic is replaced by the number of connected components. This 



model has been fitted in [18j by a maximum likelihood approach as described in Section 



3.1 where three reference measures jj, have been tested. As a result, the best fitted model 
was: fj, is the uniform law on [0,0.53], z = 2.45 and the estimated Hamiltonian, denoted 

h\oja) = 4.91 AiU^^) - 1.18 C{U^^) + 2.25 MM^), (6.1) 
where Mccip^uj^) denotes the number of connected components of U^f^. 



As explained in Section 3.1, the estimation procedure used in jlQ implies some rather 



strong restriction on the parameter space. For comparison, we fit hereafter a Quermass 
model to heather data using the TF approach. 

Following the practical recommandations in Section |4j the binary image is first ap- 
proximated by a union of balls, using the procedure described in |29) . The result of this 
approximation is shown on the right plot of Figure ISl Second, we choose for the reference 
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measure the uniform law on [0.05, 0.55], which is in agreement with the final choice made 
in |18) . except that a minimal value for the radii has been fixed to 0.05, corresponding to 
the smallest radius of the spots observed in heather dataset. 

The estimation of the full Quermass model (i.e. all parameters z, ^i, 63 are un- 
known) did not give satisfactory results. This is not very surprising in view of the strong 
variability of the estimates observed in Figure [Tj The contrast function in (3.2) computed 
from the heather dataset actually exhibits in this general case a large amount of local 
minima, leading to an identifiability issue. After a deeper insight, it appears that any 
value of the intensity parameter z can somehow be compensated by some value of the area 
interaction parameter 9i. This identifiability issue between the area interaction and the 
intensity is well-known: it is already visible in the top-left plot of Figure |4] where we ob- 
serve a strong correlation between the two estimates in the ^-process. Therefore, we have 
decided to focus on a simpler model. Since the heather dataset seems not rich enough to 
distinguish the role played by the intensity parameter z and the area interaction parameter 
61, a natural choice is to fit a (i2, x)-pi'ocess. As a matter of fact, fitting a (^, i2)-process 
leads to the same kind of identifiability issue as before. This basically means that only 
one parameter becomes relevant in the (A, £)-process for the heather dataset, namely the 
perimeter interaction parameter, which is not sufficient to obtain a good fit. 

So, the estimation of the (C, x)-process has been implemented on the balls approxima- 
tions of heather dataset (right plot of Figure [8|, using the TF procedure associated to the 
test functions /q, fan • • • , faw, /sum and fiso, where Oj = 0.005 i, i = 1, . . . , 10 and where 
fi is the uniform law on [0.05, 0.55]. As a result, we have obtained z = 2.12 and 



h\u;a) = 0.14 CiK^J + 0.22 x(^c.a). 
uality of fit of (6.2 ) to heather dataset, ■ 



(6.2) 



In order to check the quality of fit of (6.2 ) to heather dataset, we propose some diagnos- 



tic plots in Figure 9 where a comparison is also provided with the fitted model (6.1 ) of |18] . 



These plots are the same as in [18] , they correspond to an estimation of the contact distribu- 
tion function (top left): H{r) = P{D < r\D > 0) where D = inf{p >0:Un B{0, p) 7^ 0}; 
the covariance function (top right) C(r) = P{x £ U,y G U, \\x — y\\ = r); and some 
shape-characterisitcs (see |24| or |18j for a definition): erosion (middle left), dilatation 
dr (middle right), opening Or (bottom left) and closing Cr (bottom right). More specifically, 
all plots in Figure [o] show the simulated 95% envelopes obtained under (6.2) in solid black 



lines, the simulated 95% envelopes obtained under ( |6.1[ ) in dashed black lines, the empirical 
estimates from heather dataset (in solid red lines) and from its balls approximation (red 
lines with circles). 



A slight better fit is observed for (6.2), in particular from the contact distribution 
function, the erosion and the closing, even if the closing is not fitted very well. This 



conclusion seems confirmed by the visual impression from samples of (6.1 ) and (6.2 ), shown 



respectively in Figure 10 and 11 next to the balls approximation of heather dataset. 
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Figure 9: From top left to bottom right: contact distribution function, covariance function, 
erosion, dilatation, opening, closing, for the empirical estimation from heather dataset 
(solid red line), the empirical estimation for the balls approximation of heather dataset 
(red line with circles), 95% envelopes under (6.2) (solid black line), 95% envelopes under 
(|6.1|) (dashed black line) 
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Figure 10: Heather data approximation (top left) and three samples from the fitted model 
(6T) 




Figure 11: Heather data approximation (top left) and three samples from the fitted model 
(6:21 



20 



Appendix: Combining MLE and TF approaches 



The MLE is known to be an efficient metliod of estimation, but does not allow to estimate 



the parameter z* in the Quermass model (see section 3.1). Moreover, the Takacs Fiksel 



procedure allows to estimate all parameters, given well-adapted test functions (see section 



3.2 1 . A natural idea is to try to combine these two approaches. Roughly speaking, we would 
like to first estimate z* by the Takacs-Fiksel method, then estimate 0* := (^1,^2)^3) 
MLE. 

Let us first recall that, given K test functions, the solution (z, 6) of the TF optimization 



procedure (3.2) necessarily belongs to the implicit manifold z = z{6) given by (4.2) (see 
section |4]) . 

On the other hand, the MLE (z, 6) solves the estimating equations 



i(x(^c.J) 



z 

E- 
E, 



?^(wA,obs) 

,obs j 
,obsJ 
X(^a;A,obs) 



(6.3) 



where a;A,obs and Ui^^^ohs respectively denote the observed point configuration and the 
observed random set on A. Of course n(a;A,obs) is not available in practice and the system 
cannot be solved. 



A way to combine the two procedures is to look for the best solution of (6.3) restricted 
to the implicit manifold z = z{9). More specifically, 6* is estimated by 

<9mix = arg min d{¥.-^^Q-^ 0[A{U^i^), C{UujJ, x(^a;A)) ' (-^(^^^A.obs), >C(iY^A,obs), x(^c 
where d is any distance function in M^, and z is estimated by Zmix = 



C^AiObs) 

(6.4) 



In practice, it is necessary to approximate the expectations in (6.4), since their close 
expression is intractable. Let us focus on the expectation of A{V(uj^). A straightfor- 
ward approach is to use Monte-Carlo approximations. Given N independent realisations 
ui, . . . jCOn of a Quermass process on A with parameters z and 6, we have, for N large 
enough. 



1 ^ 



(6.5) 



k=l 



This approximation requires to simulate a large number of Quermass realisations for each 
value of 6, which seems prohibitively time consuming in practice. An alternative approach 
is to use MCMC approximation. Let . . . , be A'^ independent realisations of a Quer- 
mass process on A with fixed parameters zq and ^o- Then from the ergodic theorem, for 
any z and 9, it is not difficult to check that for A^ large enough 



Ek=i {z/zor^^^>eM{O-0o)A{U^o)] 



(6.6) 



This approach should be quicker in practice than (6.5) because the simulation of A^ Quer- 



mass realisations has to be done only once and not for each value of 9. Unfortunately it 
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appears that the variance of the right hand side in (6.6) becomes very large when {z,9) is 



not close to (2:0,^0)1 making the convergence prohibitively slow. A solution is to change 
the initial value {zo,9o) according to {z,6), but then the procedure looses its advantages 
in comparison with (6.5). Therefore, the approximations of the expectations in (6.4), 



whatever (6.5) or (6.6 



To assess the qua^ 



I is used, appears as a strong restriction to apply this mix procedure, 
ity of the mix procedure anyway, we focus on the estimation of z* 



and d'l in the Area process. In (6.4), we simply choose the euclidean distance for d and 
the expectation is approximated by (6.5) so that 



'l,mix 



arg mm 

01 



1 ^ 



-4(Z^a;A,obs) 



k=l 



where uJkiz{9i), 9i), k = 1,. . . ,N are independent realisations of an Area process with 
parameters {z{6i), 9i). The estimations have been implemented on the same realisations 
than in Section |5.2[ see the first row of Figure |4j and with N = 250. The results are 



displayed in Figure 12 



As a conclusion, the mix procedure does not seem to perform better than the TF 



estimators considered in section 5.2 In view of the calculation time required for the 



approximations (6.5) or (6.6), we did not give preference to this procedure in this paper. 
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Figure 12: Estimations of z* (left) and (right) for the Area process as in the first row 
of Figure [4] with, in addition, Zmix and ^i^mix (hi white). 
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